PRODUCTION OF PROTEINS BY AUTOPROTEOLYTIC CLEAVAGE
Production of proteins 

The present invention relates to a process for the production of a desired heterologous 
polypeptide with a clearly defined homogeneous N-terminus in a bacterial host cell, wherein 
the desired heterologous polypeptide is autoproteolytically cleaved from an initially
expressed fusion protein which comprises a peptide with the autoproteolytic activity of an
autoprotease Npro of a pestivirus and the heterologous polypeptide by the Npro
autoproteolytic activity.

In the production of recombinant proteins in heterologous organisms such as the expression 
of human or other eukaryotic proteins in bacterial cells it is often difficult to obtain a clearly 
defined N-terminus which is as nearly 100% homogeneous as possible. This applies in 
particular to recombinant pharmaceutical proteins whose amino acid sequence ought in 
many cases to be identical to the amino acid sequence naturally occurring in 
humans/animals. 

On natural expression, for example in humans, many pharmaceutical proteins which are in 
use are transported into the extracellular space, and cleavage of the signal sequence 
present in the precursor protein for this purpose results in a clearly defined N-terminus. 
Such a homogeneous N-terminus is not always easy to produce, for example in bacterial
cells, for several reasons. 

Only in rare cases is export into the bacterial periplasm with the aid of a pro- or eukaryotic
signal sequence suitable, because it is usually possible to accumulate only very small 
quantities of product here because of the low transport capacity of the bacterial export 
machinery. 

However, the bacterial cytoplasm differs considerably from the extracellular space of 
eukaryotes. On the one hand, reducing conditions are present therein and, on the other
hand, there is no mechanism for cleaving N-terminal leader sequences to form mature
proteins. The synthesis of all cytoplasmic proteins starts with a methionine which is
specified by the appropriate start codon (ATG = initiation of translation). This N-terminal
methionine is retained in many proteins, while in others it is cleaved by the methionine
aminopeptidase (MAP) present in the cytoplasm and intrinsic to the host. The efficiency of
the cleavage depends essentially on two parameters: 1. the nature of the following amino
acid, and 2. the location of the N-terminus in the three-dimensional structure of the protein.
The N-terminal methionine is preferentially deleted when the following amino acid is serine,
alanine, glycine, methionine or valine and when the N-terminus is exposed, i.e. not "hidden"
inside the protein. On the other hand, if the following amino acid is a different one, in
particular a charged one (glutamic acid, aspartic acid, lysine, arginine), or if the N-terminus
is located inside the protein, in most cases cleavage of the N-terminal methionine does not
occur (Knippers, Rolf (1995) Molekulare Genetik, 6th edition, Georg Thieme Verlag,
Stuttgart, New York. ISBN 3-13-103916-7).
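The rule of thumb described above can be written down as a small predicate. The following is only a minimal sketch of that heuristic, not a predictive model: the function name and the second example sequence are hypothetical, and whether the N-terminus is exposed must be supplied by the user because it cannot be read from the sequence alone.

```python
# Residues at position 2 that, according to the text, favour removal of the
# initiator methionine by methionine aminopeptidase (MAP).
FAVOURABLE_SECOND_RESIDUES = set("SAGMV")


def map_cleavage_likely(sequence: str, n_terminus_exposed: bool = True) -> bool:
    """Return True if the heuristic favours removal of the N-terminal Met.

    sequence            one-letter amino acid sequence starting at the N-terminus
    n_terminus_exposed  user-supplied assumption about the folded structure
    """
    if len(sequence) < 2 or sequence[0] != "M":
        return False
    return n_terminus_exposed and sequence[1] in FAVOURABLE_SECOND_RESIDUES


print(map_cleavage_likely("MASHHHHHHH"))         # second residue Ala
print(map_cleavage_likely("MKTAYIAKQR"))         # hypothetical, second residue Lys
print(map_cleavage_likely("MASHHHHHHH", False))  # N-terminus assumed buried
```

The predicate only says whether cleavage is favoured; as the following paragraph notes, even a favourable second residue rarely leads to complete removal of the methionine.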

And even if an amino acid promoting the cleavage is present at position 2, the cleavage is
rarely complete. It is usual for a not inconsiderable proportion (1-50%) to remain unaffected 
by the MAP. 

In the early days of the production of recombinant pharmaceutical proteins in bacterial cells 
the procedure was simply to put a methionine-encoding ATG start codon in front of the
open reading frame (ORF) for the mature (i.e. without signal sequence or other N-terminal 
extension) protein. The expressed protein then had the sequence H2N-Met-target protein. 
Only in a few cases was it possible to achieve complete cleavage of the N-terminal 
methionine by the MAP intrinsic to the host. Most of the proteins produced in this way 
therefore either are inhomogeneous in relation to their N-terminus (mixture of Met form and
Met-free form) or they all have an additional foreign amino acid (Met) at the N-terminus
(only Met form). 

This inhomogeneity or deviation from the natural sequence is, however, unacceptable in 
many cases because these products frequently show different immunological (for example 
induction of antibody formation) and pharmacological (half-life, pharmacokinetics)
properties. For these reasons, it is now necessary in most cases to produce a nature-
identical product (homogeneous and without foreign amino acids at the N-terminus). In the
case of cytoplasmic expression, the remedy here in most cases is to fuse a cleavage
sequence (leader) for a specific endopeptidase (for example factor Xa, enterokinase, KEX
endopeptidases, IgA protease) or aminopeptidase (for example dipeptidyl aminopeptidase)
to the N-terminus of the target protein. However, this makes an additional step, with
expenditure of costs and materials, necessary during further working up, the so-called
downstream processing, of the product.

There is thus a need for a process for producing a target protein in bacterial cells, the
intention being that the target protein can be prepared with a uniform, desired N-terminus 
without elaborate additional in vitro steps (refolding, purification, protease cleavage, 
renewed purification etc.). Such a process using the viral autoprotease from 
pestiviruses has been developed within the scope of the present invention. 

Pestiviruses form a group of pathogens which cause serious economic losses in pigs and 
ruminants around the world. As the pathogen of a notifiable transmissible disease, the 
classical swine fever virus (CSFV) is particularly important. The losses caused by bovine
viral diarrhoea virus (BVDV) are also considerable, especially through the regular 
occurrence of intrauterine infections of foetuses. 

Pestiviruses are small enveloped viruses with a genome which acts directly as mRNA and is 
12.3 kb in size and from which the viral gene products are transcribed in the cytoplasm. 
This takes place in the form of a single polyprotein which comprises about 4000 amino 
acids and which is broken down both by viral and by cellular proteases into about 12 mature 
proteins. 

To date, two virus-encoded proteases have been identified in pestiviruses, the autoprotease
Npro and the serine protease NS3. The N-terminal protease Npro is located at the N-terminus
of the polyprotein and has an apparent molecular mass of 23 kd. It catalyses a cleavage
which takes place between its own C-terminus (Cys168) and the N-terminus (Ser169) of
nucleocapsid protein C (R. Stark et al., J. Virol. 67 (1993), 7088-7095). In addition,
duplications of the Npro gene have been described in cytopathogenic BVDV viruses. In these
there is a second copy of Npro at the N-terminus of the likewise duplicated NS3 protease. An
autoproteolytic cleavage of the Npro-NS3 protein is observed in this case too (R. Stark et al.,
see above).

Npro is an autoprotease with a length of 168 aa and an apparent Mr of about 20,000 d
(in vivo). It is the first protein in the polyprotein of pestiviruses (such as CSFV, BDV (border
disease virus) or BVDV) and undergoes autoproteolytic cleavage from the following
nucleocapsid protein C (M. Wiskerchen et al., J. Virol. 65 (1991), 4508-4514; Stark et al., J.
Virol. 67 (1993), 7088-7095). This cleavage takes place after the last amino acid in the
sequence of Npro, Cys168.

It has now surprisingly been found within the scope of the present invention that the 
autoproteolytic function of an autoprotease of a pestivirus is retained in bacterial 
expression systems, in particular on expression of heterologous proteins. The present 
invention thus relates to a process for the production of a desired heterologous polypeptide 
with a clearly defined homogeneous N-terminus in a bacterial host cell, wherein the desired
heterologous polypeptide is cleaved from an initially expressed fusion protein which
comprises a peptide with the autoproteolytic activity of an autoprotease Npro of a pestivirus
and the heterologous polypeptide by the Npro autoproteolytic activity. The invention further
relates to cloning means which are employed in the process according to the invention. 

A polypeptide with the autoproteolytic activity of an autoprotease Npro of a pestivirus or a
polypeptide with the autoproteolytic function of an autoprotease Npro of a pestivirus is, in
particular, an autoprotease Npro of a pestivirus, or a derivative thereof with autoproteolytic
activity. 

Within the scope of the present invention, the term "heterologous polypeptide" means a
polypeptide which is not naturally cleaved by an autoprotease of a pestivirus from a
naturally occurring fusion protein or polyprotein. Examples of heterologous polypeptides are
industrial enzymes (process enzymes) or polypeptides with pharmaceutical, in particular 
human pharmaceutical, activity. 

Examples of preferred polypeptides with human pharmaceutical activity are cytokines such 
as interleukins, for example IL-6, interferons such as leukocyte interferons, for example
interferon α2B, growth factors, in particular haemopoietic or wound-healing growth factors,
such as G-CSF, erythropoietin, or IGF, hormones such as human growth hormone (hGH), 
antibodies or vaccines. 

In one aspect, the present invention thus relates to a nucleic acid molecule which codes for
a fusion protein where the fusion protein comprises a first polypeptide which has the
autoproteolytic function of an autoprotease Npro of a pestivirus, and a second polypeptide
which is connected to the first polypeptide at the C-terminus of the first polypeptide in a
manner such that the second polypeptide is capable of being cleaved from the fusion
protein by the autoproteolytic activity of the first polypeptide, and where the second
polypeptide is a heterologous polypeptide. 

The pestivirus for this purpose is preferably selected from the group of CSFV, BDV and
BVDV, with CSFV being particularly preferred.

A preferred nucleic acid molecule according to the invention is one where the first 
polypeptide of the fusion protein comprises the following amino acid sequence of the
autoprotease Npro of CSFV (see also EMBL database accession number X87939) (amino
acids 1 to 168, reading from the N-terminal to the C-terminal direction)

(1)-MELNHFELLYKTSKQKPVGVEEPVYDTAGRPLFGNPSEVHPQSTLKLPHDRGRGDIRTTLRDL
PRKGDCRSGNHLGPVSGIYIKPGPVYYQDYTGPVYHRAPLEFFDEAQFCEVTKRIGRVTGSDGKLYH
IYVCVDGCILLKLAKRGTPRTLKWIRNFTNCPLWVTSC-(168),

or the amino acid sequence of a derivative thereof with autoproteolytic activity.



Derivatives with autoproteolytic activity of an autoprotease Npro of a pestivirus are those
autoproteases Npro produced by mutagenesis, in particular amino acid substitution, deletion,
addition and/or amino acid insertion, as long as the required autoproteolytic activity, in
particular for generating a desired protein with homogeneous N-terminus, is retained.
Methods for generating such derivatives by mutagenesis are familiar to the skilled person. It
is possible by such mutations to optimize the activity of the autoprotease Npro, for example,
in relation to different heterologous proteins to be cleaved. After production of a nucleic acid
which codes for a fusion protein which, besides the desired heterologous protein, comprises
an autoprotease derivative which exhibits one or more mutations by comparison with a
naturally occurring autoprotease Npro, it is established whether the required function is
present by determining the autoproteolytic activity in an expression system.

The autoproteolytic activity can, for example, initially be detected by an in vitro system. For
this purpose, the DNA construct is transcribed into RNA and translated into protein with the
aid of an in vitro translation kit. In order to increase the sensitivity, the resulting protein is in
some cases labelled by incorporation of a radioactive amino acid. The resulting Npro-target
protein fusion protein undergoes co- and/or post-translational autocatalytic cleavage, there
being accurate cleavage of the N-terminal Npro portion by means of its autoproteolytic
activity from the following target protein. The resulting cleavage products can easily be
detected, and the mixture can be worked up immediately after completion of the in vitro
translation reaction. The mixture is subsequently loaded onto a protein gel (for example
Laemmli SDS-PAGE) and subjected to electrophoresis. The gel is subsequently stained with
suitable dyes or autoradiographed. A Western blot with subsequent immunostaining is also
possible. The efficiency of the cleavage of the fusion protein can be assessed on the basis
of the intensity of the resulting protein bands.
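One way such a band-intensity assessment could be made quantitative is sketched below. This is an illustrative assumption rather than the procedure used in the text: it presumes densitometric intensities that are proportional to protein mass, and it converts them to a molar basis by dividing by the molecular mass of each band. The function name and the numerical values are hypothetical.

```python
def cleavage_efficiency(fusion_band: float, product_band: float,
                        fusion_mass_kd: float, product_mass_kd: float) -> float:
    """Estimate the molar fraction of fusion protein that has been cleaved.

    fusion_band, product_band  densitometric band intensities (arbitrary units)
    fusion_mass_kd             molecular mass of the uncleaved fusion protein
    product_mass_kd            molecular mass of the released cleavage product
    """
    if fusion_mass_kd <= 0 or product_mass_kd <= 0:
        raise ValueError("molecular masses must be positive")
    # Intensity per unit mass approximates the relative number of molecules.
    uncleaved = fusion_band / fusion_mass_kd
    cleaved = product_band / product_mass_kd
    total = uncleaved + cleaved
    if total == 0:
        raise ValueError("no signal in either band")
    return cleaved / total


# Hypothetical readings: a faint 32 kd fusion band and a strong 14 kd product band.
print(f"{cleavage_efficiency(10.0, 40.0, 32.0, 14.0):.2f}")
```

Only one of the two cleavage products is needed for the estimate, since each cleaved fusion molecule yields exactly one copy of each fragment.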

In a further step, the nucleic acid fragment for the fusion protein can be cloned into a 
bacterial expression vector (if this has not already happened for the in vitro translation) and 
the latter can be transformed into an appropriate host (e.g. E. coli). The resulting expression 
strain expresses the fusion protein constitutively or after addition of an inducer. In the latter 
case it is necessary to cultivate further for one or more hours after addition of the inducer in 
order to achieve a sufficient titre of the product. The Npro autoprotease then cleaves itself
co- or post-translationally from the expressed fusion protein so that the resulting cleavage
fragments are the Npro autoprotease per se and the target protein with defined N-terminus.
To evaluate the efficiency of this cleavage reaction, a sample is taken after the end of the 
cultivation or induction phase and analysed by SDS-PAGE as described above. 

A preferred autoprotease Npro derivative of the described fusion protein has, for example, an
N-terminal region in which one or more amino acids have been deleted or substituted in the
region of amino acids 2 to 21 as long as the resulting derivative continues to exhibit the
autoproteolytic function of the autoprotease Npro to the desired extent. In the context of the
present invention, autoprotease Npro derivatives which are preferred in the fusion protein
comprise, for example, the amino acid sequence of the autoprotease Npro of CSFV with a
deletion of amino acids 2 to 16 or 2 to 21. It is also possible by amino acid substitution or
addition to exchange or introduce amino acid sequences, for example in order to introduce
an amino acid sequence which assists purification (see examples).



A particularly preferred nucleic acid molecule according to the present invention is one
where the first polypeptide comprises the amino acid sequence Glu22 to Cys168 of the
autoprotease Npro of CSFV or a derivative thereof with autoproteolytic activity, the first
polypeptide furthermore having a Met as N-terminus, and the heterologous polypeptide
being connected directly to the amino acid Cys168 of the autoprotease Npro of CSFV.

A likewise preferred nucleic acid molecule according to the present invention is one where 
the first polypeptide comprises the amino acid sequence Pro17 to Cys168 of the 
autoprotease Npro of CSFV or a derivative thereof with autoproteolytic activity, the first
polypeptide furthermore having a Met as N-terminus, and the heterologous polypeptide
being connected directly to the amino acid Cys168 of the autoprotease Npro of CSFV.
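The two preferred first polypeptides (Met followed by Glu22 to Cys168, and Met followed by Pro17 to Cys168) can be derived mechanically from the full-length sequence. The sketch below assumes the 168-residue CSFV Npro sequence as read from the listing above, which is partly illegible in this copy and should be checked against EMBL X87939 before use; the helper name is hypothetical.

```python
# Assumed full-length Npro sequence of CSFV (168 aa), transcribed from the
# listing in the text; verify against EMBL accession X87939 before relying on it.
NPRO_CSFV = (
    "MELNHFELLYKTSKQKPVGVEEPVYDTAGRPLFGNPSEVHPQSTLKLPHDRGRGDIRTTLRDL"
    "PRKGDCRSGNHLGPVSGIYIKPGPVYYQDYTGPVYHRAPLEFFDEAQFCEVTKRIGRVTGSDGKLYH"
    "IYVCVDGCILLKLAKRGTPRTLKWIRNFTNCPLWVTSC"
)


def n_terminal_deletion(npro: str, first_kept: int) -> str:
    """Return Met followed by residues first_kept..end (1-based numbering)."""
    if not 2 <= first_kept <= len(npro):
        raise ValueError("first_kept must lie within the sequence")
    return "M" + npro[first_kept - 1:]


glu22_derivative = n_terminal_deletion(NPRO_CSFV, 22)  # deletion of aa 2 to 21
pro17_derivative = n_terminal_deletion(NPRO_CSFV, 17)  # deletion of aa 2 to 16

for name, seq in (("Glu22-Cys168", glu22_derivative),
                  ("Pro17-Cys168", pro17_derivative)):
    print(name, len(seq), seq[:8], "...", seq[-4:])
```

Both derivatives keep Cys168 as their last residue, so the heterologous polypeptide is still joined directly after the cleavage site.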

A nucleic acid molecule according to the invention is, in particular, in the form of a DNA
molecule. 

The present invention further relates to cloning elements, in particular expression vectors
and host cells, which comprise a nucleic acid molecule according to the invention. Hence
the present invention further relates to an expression vector which is compatible with a
predefined bacterial host cell, comprising a nucleic acid molecule according to the invention
and at least one expression control sequence. Expression control sequences are, in
particular, promoters (such as lac, tac, T3, T7, trp, gac, vhb, lambda pL or phoA), ribosome
binding sites (for example natural ribosome binding sites which belong to the
abovementioned promoters, cro or synthetic ribosome binding sites), or transcription
terminators (for example rrnB T1T2 or bla). The above host cell is preferably a bacterial cell
of the genus Escherichia, in particular E. coli. However, it is also possible to use other
bacterial cells (see below). In a preferred embodiment, the expression vector according to
the invention is a plasmid.

The present invention further relates to a bacterial host cell which comprises an expression 
vector according to the invention. Such a bacterial host cell can be selected, for example, 
from the group of the following microorganisms: Gram-negative bacteria such as 
Escherichia species, for example E. coli, or other Gram-negative bacteria, for example
Pseudomonas sp., such as Pseudomonas aeruginosa, or Caulobacter sp., such as
Caulobacter crescentus, or Gram-positive bacteria such as Bacillus sp., in particular Bacillus
subtilis. E. coli is particularly preferred as host cell.

The present invention further relates to a process for the production of a desired
heterologous polypeptide, comprising 

(i) cultivation of a bacterial host cell according to the present invention which comprises an 
expression vector according to the present invention which in turn comprises a nucleic acid 
molecule according to the present invention, wherein cultivation occurs under conditions 
which cause expression of the fusion protein and further autoproteolytic cleavage of the 
heterologous polypeptide from the fusion protein in the host cell by the autoproteolytic 
activity of the first polypeptide, and 

(ii) isolation of the cleaved heterologous polypeptide. 

The process according to the invention is carried out in principle by initially cultivating the 
bacterial host cell, i.e. the expression strain, in accordance with microbiological practice 
known per se. The strain is generally brought up starting from a single colony on a nutrient 
medium, but it is also possible to employ cryopreserved cell suspensions (cell banks). The 
strain is generally cultivated in a multistage process in order to obtain sufficient biomass for 
further use. 

On a small scale, this can take place in shaken flasks, it being possible in most cases to 
employ a complex medium (for example LB broth). However, it is also possible to use
defined media (for example citrate medium). For the cultivation, a small-volume preculture
of the host strain (inoculated with a single colony or with a cell suspension from a
cryoculture) is grown, the temperature for this cultivation not generally being critical for the
later expression result, so that it is possible routinely to operate at relatively high
temperatures (for example 30°C or 37°C). The main culture is set up in a larger volume (for
example 500 ml), where it is in particular necessary to ensure good aeration (large volume
of flask compared with the volume of contents, high speed of rotation). Since it is intended
that expression take place in soluble form, the main culture will in most cases also be
carried out at a somewhat lower temperature (for example 22 or 28°C). Both inducible
systems (for example with trp, lac, tac or phoA promoter) and constitutive systems are
suitable for producing soluble proteins. After the late logarithmic phase has been reached
(usually at an optical density of 0.5 to 1.0 in shaken flasks), in inducible systems the inducer
substance (for example indoleacrylic acid, isopropyl β-D-thiogalactopyranoside = IPTG) is
added and incubation is continued for 1 to 5 hours. The concentration of the inducer
substance will in this case tend to be chosen at the lower limit in order to make careful
expression possible. During this time, most of the Npro-target protein fusion protein is
formed, there being co- or post-translational cleavage of the Npro portion so that the two
cleaved portions are present separately after the end of cultivation. The resulting cells can
be harvested and processed further. 

On a larger scale, the multistage system consists of a plurality of bioreactors (fermenters), it
being preferred to employ defined nutrient media in this case in order to be able to improve
the process engineering control of the process. In addition, it is possible greatly to increase
biomass and product formation by metering in particular nutrients (fed batch). Otherwise,
the process is analogous to the shaken flask. For example, a preliminary stage fermenter
and a main stage fermenter are used, the cultivation temperature being chosen similar to
that in the shaken flask. The preliminary stage fermenter is inoculated with a so-called
inoculum which is generally grown from a single colony or a cryoculture in a shaken flask.
Good aeration and a sufficient inducer concentration must also be ensured in the fermenter
- and especially in the main stage thereof. The induction phase must, however, in some
cases be made distinctly longer compared with the shaken flask. The resulting cells are 
once again delivered for further processing. 

The heterologous target protein which has been cleaved from the fusion protein can then 
be isolated by protein purification methods known to the skilled person (see, for example,
M.P. Deutscher, in: Methods in Enzymology: Guide to Protein Purification, Academic Press
Inc., (1990), 309-392). A purification sequence generally comprises a cell disruption step, a
clarification step (centrifugation or microfiltration) and various chromatographic steps,
filtrations and precipitations.

The following examples serve to illustrate the present invention, without in any way limiting
the scope thereof. 




Example 1: Expression and in vivo cleavage of an Npro-C fusion protein in a bacterial host

The plasmid NPC-pET is constructed for expression of an Npro-C fusion protein in a bacterial
host. The expression vector used is the vector pET11a (F.W. Studier et al., Methods.
Enzymol. 185 (1990), 60-89). The natural structural gene (from the CSFV RNA genome) for
the Npro-C fusion protein is cloned into this expression vector. The structural gene for this
fusion protein is provided by PCR amplification from a viral genome which has been
transcribed into cDNA (and cloned into a vector). Moreover the first 16 amino acids of the
natural Npro sequence (MELNHFELLYKTSKQK) are replaced by a 10 amino acid-long oligo-
histidine purification aid (MASHHHHHHH). The resulting construct is called NPC-pET. The
sequence of the Npro portion and the autoproteolytic cleavage site of the Npro-C fusion
protein encoded on the NPC-pET has the following structure, with the cleavage site being
located between the amino acids Cys168 and Ser169:

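MASHHHHHHHPVGVEEPVYDTAGRPLFGNPSEVHPQSTLKLPHDRGRGDIRTTLRDLPRKGDCRSGN
HLGPVSGIYIKPGPVYYQDYTGPVYHRAPLEFFDEAQFCEVTKRIGRVTGSDGKLYHIYVCVDGCIL
LKLAKRGTPRTLKWIRNFTNCPLWVTSCS... (nucleocapsid protein C)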

In the sequence, proline 17 (position 2 of the fusion protein) from the natural Npro sequence
is put in italics, and the start of the C sequence is printed in bold. The fusion protein has an
approximate Mr of 32 kd, with the Npro portion accounting for 18 kd and the C portion
accounting for 14 kd after autoproteolytic cleavage.

In order to evaluate the significance of the first amino acid C-terminal of the cleavage site, 
the serine 169 which is naturally present there is replaced by the 19 other naturally
occurring amino acids by targeted mutagenesis. The constructs produced thereby are
called NPC-pET-Ala, NPC-pET-Gly etc. The expression strains are produced using these
plasmids. 
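The set of position-169 variants and their construct names can be enumerated programmatically. The sketch below is purely illustrative: the naming pattern follows the two names given in the text, the target sequence is a hypothetical placeholder (the C protein sequence beyond Ser169 is not reproduced here), and the `cleave_after_npro` helper simply splits the fusion at the known cleavage position to show which residue becomes the new N-terminus.

```python
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def position_169_variants(base_name: str = "NPC-pET", wild_type: str = "S") -> dict:
    """Map construct names to the residue replacing the wild-type Ser169."""
    return {f"{base_name}-{THREE_LETTER[aa]}": aa
            for aa in THREE_LETTER if aa != wild_type}


def cleave_after_npro(fusion: str, npro_length: int) -> tuple:
    """Split a fusion protein after the last residue of the Npro portion."""
    return fusion[:npro_length], fusion[npro_length:]


NPRO_TAIL = "CPLWVTSC"        # C-terminal residues of Npro as given in the text
TARGET_PLACEHOLDER = "SAAAA"  # hypothetical stand-in for the target protein

variants = position_169_variants()
for name, residue in sorted(variants.items()):
    fusion = NPRO_TAIL + residue + TARGET_PLACEHOLDER[1:]
    _, released = cleave_after_npro(fusion, len(NPRO_TAIL))
    print(name, "-> released product starts with", released[0])
```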

Escherichia coli BL21(DE3) is used as Escherichia coli host strain for expression of the
Npro-C fusion proteins. This strain has the following genotype:

E. coli B F- dcm ompT hsdS(rB- mB-) gal λ(DE3)

The strain is commercially available in the form of competent cells from Stratagene. It
harbours a lysogenic lambda phage in the genome which comprises the gene for T7 RNA
polymerase under the control of the lacUV5 promoter. Production of the T7 RNA
polymerase and consequently also of the target protein can thus be induced by addition of
isopropyl β-D-thiogalactopyranoside (IPTG). This two-stage system permits very high
specific and absolute expression levels for many target proteins to be achieved. 

The expression strains BL21(DE3)[NPC-pET], BL21(DE3)[NPC-pET-Ala] etc. are produced
by transforming the respective expression plasmid into BL21(DE3). The transformation
takes place in accordance with the statements by the manufacturer of the competent cells 
(Stratagene or Novagen). The transformation mixture is plated out on Luria agar plates with 
100 mg/l ampicillin. This transformation results in numerous clones in each case after 
incubation at 37°C (overnight). 

A medium-sized colony with distinct margins is picked and forms the basis for the 
appropriate expression strain. The clone is cultured and preserved in cryoampoules at -80°C
(master cell bank MCB). The strain is streaked on Luria agar plates (with ampicillin) for daily
use.



The particular strain is used for inoculating a preculture in a shaken flask from a single 
colony subcultured on an agar plate. An aliquot of the preculture is used to inoculate a main 
culture (10 to 200 ml in a shaken flask) and raised until the OD600 is from 0.5 to 1.0.
Production of the fusion protein is then induced with 1.0 mM IPTG (final concentration). The
cultures are further cultivated for 2-4 h, an OD600 of about 1.0 to 2.0 being reached. The
cultivation temperature is 30°C +/- 2°C, and the medium used is LB medium + 2 g/l glucose
+ 100 mg/l ampicillin.

Samples are taken from the cultures before induction and at various times after induction 
and are centrifuged, and the pellets are boiled in denaturing sample buffer and analysed by 
SDS-PAGE and Coomassie staining or Western blot. The samples are taken under
standardized conditions, and differences in the density of the cultures are compensated by
the volume of sample loading buffer used for resuspension.

The bands appearing after induction are located at somewhat above 20 kd (Npro) and at
about 14 kd (C). The efficiency of cleavage of the fusion protein with each construct is
estimated on the basis of the intensity of the bands in the Coomassie-stained gel and in the
Western blot. It is found from this that most amino acids are tolerated at the position
immediately C-terminal of the cleavage site (i.e. at the N-terminus of the target protein), i.e.
very efficient autoproteolytic cleavage takes place. 

These data show that it is possible in principle to employ successfully the autoproteolytic
activity of the autoprotease Npro for the specific cleavage of a recombinant fusion protein in
a bacterial host cell. 

Example 2: Expression and in vivo cleavage of a fusion protein of Npro and human
interleukin 6 (hIL6) to produce homogeneous mature hIL6

The plasmid NP6-pET is constructed for expression of the Npro-hIL6 fusion protein. pET11a
(F.W. Studier et al., Methods. Enzymol. 185 (1990), 60-89) is used as expression vector.
Firstly a fusion protein consisting of Npro and the CSFV nucleocapsid protein is cloned into
this expression vector (see Example 1). The structural gene for this fusion protein is
provided by a PCR. This entails the first 16 aa of the natural Npro sequence
(MELNHFELLYKTSKQK) being replaced by a 10 aa-long oligo-histidine purification aid 
(MASHHHHHHH). 

An SpeI cleavage site is introduced into the resulting expression plasmid at the junction
between Npro and nucleocapsid protein by targeted mutagenesis. This makes it possible to
delete the structural gene for the nucleocapsid protein from the vector by restriction with
SpeI at the 5' end (corresponding to the N-terminus of the protein) and XhoI at the 3' end
(corresponding to the C-terminus of the protein). The corresponding linearized Npro-pET11a
vector is separated from the nucleocapsid gene fragment by preparative gel electrophoresis.
It is then possible to introduce the hIL6 structural gene via the "sticky" SpeI and XhoI ends.

The following preparatory work is necessary for this. The structural gene is amplified with 
the aid of a high-precision PCR (for example Pwo system from Roche Biochemicals,
procedure as stated by the manufacturer) from an hIL6 cDNA clone which can be produced
from C10-MJ2 cells. The following oligonucleotides are employed for this purpose:

Oligonucleotide 1 ("N-terminal"):

5'- ATAATTACTA GTTGTGCTCC AGTACCTCCA GGTGAAG -3'

Oligonucleotide 2 ("C-terminal"):

5'- ATAATTGGAT CCTCGAGTTA TTACATTTGC CGAAGAGCCC TCAGGC -3'

An SpeI cleavage site is introduced at the 5' end, and an XhoI cleavage site is introduced at
the 3' end via the oligonucleotides used. In addition, a double ochre stop codon (TAATAA)
is introduced at the 3' end of the structural gene for efficient termination of translation. The
SpeI cleavage site at the front end permits ligation in reading frame with the Npro-pET11a
vector described above. The XhoI cleavage site at the rear end makes directed cloning
possible.
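The design features of the two oligonucleotides can be checked with a few lines of code. The sketch below is an illustration added here, not part of the described procedure: it looks for the SpeI (ACTAGT) and XhoI (CTCGAG) recognition sequences, translates oligonucleotide 1 in the frame that starts at the SpeI site, and inspects the sense strand of oligonucleotide 2 for the double stop codon. The helper names are hypothetical, and the standard genetic code is assumed.

```python
SPEI, XHOI = "ACTAGT", "CTCGAG"
PRIMER_1 = "ATAATTACTAGTTGTGCTCCAGTACCTCCAGGTGAAG"
PRIMER_2 = "ATAATTGGATCCTCGAGTTATTACATTTGCCGAAGAGCCCTCAGGC"

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Oligonucleotide 1: reading frame starting at the SpeI site.
frame_start = PRIMER_1.index(SPEI)
print("SpeI frame:", translate(PRIMER_1[frame_start:]))

# Oligonucleotide 2 anneals to the antisense strand, so inspect its complement.
sense_3prime = reverse_complement(PRIMER_2)
print("sense strand:", sense_3prime)
print("double stop before XhoI:", ("TAATAA" + XHOI) in sense_3prime)
```

Read in this frame, oligonucleotide 1 encodes Thr-Ser-Cys followed by Ala-Pro-Val-Pro-Pro-Gly-Glu, i.e. the last three residues of Npro joined directly to the first hIL6 codon (Ala), which is consistent with the in-frame ligation described above.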

The sequence of the PCR fragment (593 bp) with the structural gene for hIL6 is depicted
below (read in the N-terminal to C-terminal direction). The restriction cleavage sites are
underlined, and the first codon of hIL6 (Ala) and the stop codon are printed in bold:

ATAATTACTAGTTGTGCTCCAGTACCTCCAGGTGAAG[...]

AGCCACTCACCTCTTCAGAACGAATTGACAAACAAATTCGGTACATCCTCGACGGCATCTCAGCCCT 

GAGAAAGGAGACATGTAACAAGAGTAACATGTGTGAAAGCAGCAAAGAGGCACTGGCAGAAAACAAC 
CTGAACCTTCCAAAGATGGCTGAAAAAGATGGAT^ 

TGGTAAAAATCATCACTGGTCTTTTGGAGTTTGAGGTATACCTAGAGTACCTCCAGAACAGATTTGA 
GAGTAGTGAGGAACAAGCCAGAGCTGTGCAGATGAGTACAAAAGTCCTGATCCAGTTCCTGCAGAAA 
AAGGCAAAGAATCTAGATGCAATAACCACCCCTGACCCAACCACAAATGCCAGCCTGCTGACGAAGC 
TGCAGGCACAGAACCAGTGGCTGCAGGACATGACAACTCATCTCATTCTGCGCAGCTTTAAGGAGTT 
CCTGCAGTCCAGCCTGAGGGCTCTTCGGCAAATGTAATAACTCGAGGATCCAATTAT 

The construct produced by the ligation with the Npro-pET11a plasmid is called NP6-pET.

The sequence of the Npro-hIL6 fusion protein (347 amino acids, of which 162 amino acids
for the Npro portion and 185 amino acids for the hIL6 portion), encoded on NP6-pET is
depicted below, with the hIL6 sequence being printed in bold:



MASHHHHHHHPVGVEEPVYDTAGRPLFGNPSEVHPQSTLKLPHDRGRGDIRTTLRDLPRKGDCRSGN
HLGPVSGIYIKPGPVYYQDYTGPVYHRAPLEFFDEAQFCEVTKRIGRVTGSDGKLYHIYVCVDGCIL
LKLAKRGTPRTLKWIRNFTNCPLWVTSCAPVPP(

The fusion protein has an Mr of 39,303.76 d in the reduced state, and after a possible
cleavage the Npro portion (reduced) would have an Mr of 18,338.34 d and the hIL6 portion
(reduced) would have 20,983.63 d. Npro has six cysteines and hIL6 four. It is likely that these
cysteines are for the most part in reduced form in the bacterial cytoplasm. During the
subsequent processing there is presumably at least partial formation of disulphide bridges.
It must be expected that the N-terminal methionine in the fusion protein (or in the Npro
portion) is mostly cleaved by the methionine aminopeptidase (MAP) intrinsic to the host,
which would reduce the Mr by about 131 d in each case to 39,172.76 d (fusion protein) and
18,207.13 d (Npro).
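The order of magnitude of these figures can be reproduced from the sequence. The sketch below is an assumption-laden illustration: it uses commonly tabulated average residue masses (the table the authors used is not stated, so small deviations are to be expected) and the His-tagged Npro portion as reconstructed from the sequence listing, which is partly illegible in this copy. Under these assumptions the result should fall within a few daltons of the value quoted for the reduced Npro portion, and removing the N-terminal Met lowers it by about 131 d.

```python
# Average residue masses in daltons (amino acid minus water); standard
# tabulated values, assumed here since the source does not name its table.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def average_mass(sequence: str) -> float:
    """Average molecular mass of a linear, fully reduced polypeptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER


# His-tagged Npro portion (tag + Pro17..Cys168), as reconstructed from the text.
NPRO_HIS = (
    "MASHHHHHHH"
    "PVGVEEPVYDTAGRPLFGNPSEVHPQSTLKLPHDRGRGDIRTTLRDL"
    "PRKGDCRSGNHLGPVSGIYIKPGPVYYQDYTGPVYHRAPLEFFDEAQFCEVTKRIGRVTGSDGKLYH"
    "IYVCVDGCILLKLAKRGTPRTLKWIRNFTNCPLWVTSC"
)

with_met = average_mass(NPRO_HIS)
without_met = average_mass(NPRO_HIS[1:])  # after removal of Met by MAP

print(f"residues: {len(NPRO_HIS)}, cysteines: {NPRO_HIS.count('C')}")
print(f"Npro portion with Met:    {with_met:,.2f} d")
print(f"Npro portion without Met: {without_met:,.2f} d")
print(f"difference:               {with_met - without_met:.2f} d")
```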

The Escherichia coli host strain for expressing the Npro-hIL6 fusion protein is Escherichia coli
BL21(DE3) (see Example 1).

The expression strain BL21(DE3)[NP6-pET] is produced by transforming the expression
plasmid NP6-pET described above into BL21(DE3) as described in Example 1.

The strain BL21(DE3)[NP6-pET] is subcultured from a single colony on an agar plate, which
is then used to inoculate a preculture in Luria broth + 100 mg/l ampicillin (200 ml in a 1 l
baffle flask). The preculture is shaken at 250 rpm and 30°C for 14 h and reaches an OD600
of about 1.0 during this time. Then 10 ml portions of preculture are used to inoculate the main
cultures (330 ml of Luria broth in each 1 l baffle flask) (3% inoculum). The main cultures are
run at 30°C (250 rpm) until the OD600 has increased to 0.8, and then production of the
fusion protein is induced with 0.5 or 1.0 mM IPTG (final concentration). The cultures are
cultivated further at 30°C and 250 rpm for 3 h, the OD600 reaching about 1.0 to 2.0.
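
The volumes in this protocol can be checked with a short calculation. The Python sketch below computes the inoculum percentage from the stated 10 ml and 330 ml, and the volume of IPTG stock needed for the stated final concentrations. The 1 M stock concentration is an assumption made purely for illustration (the text gives only final concentrations), and the function names are hypothetical.

```python
# Culture arithmetic for the induction protocol (illustrative sketch).
# Assumption: a 1 M (1000 mM) IPTG stock; the text states only final concentrations.

def inoculum_percent(inoculum_ml: float, medium_ml: float) -> float:
    """Inoculum volume expressed as a percentage of the medium volume."""
    return 100.0 * inoculum_ml / medium_ml


def stock_volume_ul(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Stock volume in microlitres from C1*V1 = C2*V2, ignoring the small
    volume contributed by the stock itself."""
    return final_mM * culture_ml * 1000.0 / stock_mM


culture_ml = 330.0 + 10.0  # main culture medium plus preculture inoculum

print(f"{inoculum_percent(10.0, 330.0):.1f} % inoculum")
for final_mM in (0.5, 1.0):
    volume = stock_volume_ul(final_mM, culture_ml, stock_mM=1000.0)
    print(f"{final_mM} mM IPTG: {volume:.0f} microlitres of stock")
```

With these inputs the inoculum comes out at roughly 3%, in line with the figure given in the protocol.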

The cultures are transferred into sterile 500 ml centrifuge bottles and centrifuged at
10,000 g for 30 min. The centrifugation supernatant is completely discarded and the pellets
are frozen at -80°C until processed further.

The appearance of new protein bands in the complete lysate can easily be detected by
Coomassie staining after SDS-PAGE. Bands with apparent molecular masses of about
19 kd, 21 kd and 40 kd appear in the lysate of BL21(DE3)[NP6-pET]. Analyses of this
expression using specific anti-hIL6 antibodies essentially confirm the result obtained after
Coomassie staining.
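
The correspondence between the observed bands and the calculated masses can be illustrated with a nearest-mass assignment. The Python sketch below uses the calculated values from the text, rounded to kd. The assignment rule is only a heuristic chosen for this illustration, since apparent masses on SDS-PAGE are approximate and the antibody analysis is what confirms the identity of the hIL6 band.

```python
# Nearest-mass assignment of SDS-PAGE bands (illustrative heuristic, not part
# of the described method). Calculated masses are taken from the text, in kd.

CALCULATED_KD = {
    "Npro": 18.34,
    "hIL6": 20.98,
    "Npro-hIL6 fusion": 39.30,
}
OBSERVED_KD = [19, 21, 40]


def assign_bands(observed, calculated):
    """Map each observed band to the species with the closest calculated mass."""
    return {
        band: min(calculated, key=lambda name: abs(calculated[name] - band))
        for band in observed
    }


print(assign_bands(OBSERVED_KD, CALCULATED_KD))
# {19: 'Npro', 21: 'hIL6', 40: 'Npro-hIL6 fusion'}
```

The presence of the 19 kd and 21 kd bands alongside the 40 kd band is what indicates that part of the fusion protein has been cleaved.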

To optimize the Npro-hIL6 cleavage, inductions are carried out at various temperatures and
IPTG concentrations and again analysed both in the stained gel and by a Western blot.
Almost complete cleavage of Npro-hIL6 is observed at a culture temperature of 22°C.

This experiment shows that heterologous proteins can also be fused to the C-terminus of
Npro in a bacterial expression system, and very efficient cleavage takes place. A change in
the N-terminal amino acid of the following protein (alanine in place of serine) has no
adverse effects either. This system is accordingly suitable according to the invention for
producing recombinant proteins with a homogeneous authentic N-terminus, especially in a
heterologous expression system such as a bacterial expression system, without further
processing steps.

EXAMPLE 3: Expression and in-vivo cleavage of a fusion protein of Npro and
human interferon α2B (IFNα2B) to produce homogeneous mature IFNα2B

The way of cloning IFNα2B to produce the vector NPI-pET corresponds to the way
described for hIL6 in Example 2. The structural gene is amplified by high-precision PCR (for
example Pwo system from Roche Biochemicals, procedure as stated by the manufacturer).
The template used is an IFNα2B cDNA clone which can be produced from human
leukocytes by standard methods known to the skilled person. An alternative possibility is
to carry out a complete synthesis of the gene. The sequence of the structural gene is
obtainable in electronic form via the GenBank database under accession number V00548.
The following oligonucleotides are employed for the amplification:

Oligonucleotide 1 ("N-terminal"):

5'- ATAATTACTA GTTGTTGTGA TCTGCCTCAA ACCCACAGCC -3' 

Oligonucleotide 2 ("C-terminal"): 

5'- ATAATTGGAT CCTCGAGTTA TTATTCCTTA CTTCTTAAAC TTTCTTGCAA G -3'

The sequence of the PCR fragment (533 bp) with the structural gene for IFNα2B is depicted
below. The restriction cleavage sites are underlined, and the first codon of IFNα2B (Cys)
and the stop codon are printed in bold:

ATAATT ACTAGT TGTTGTGATCTGCCTCAAACCCACAGCCTGGGTAGCAGGAGGACCTTGATGCTCC
TGGC^CAGATGAGGAGAATCTCTCTTTTCTCCTGCT^ 

GGAGGAGTTTGGCAACCAGTTCCAAAAGGCTGAAACCATCCCTGTCCTCCATGAGATGATCCAGCAG 
ATCTTCAATCTCTTCAGCACAAAGGACTCATCTGCTGCTTGGGATGAGACC 

ACACTGAACTCTACCAGCAGCTGAATGACCTGGAAGCCTGTGTGATACAGGGGGTGGGGGTGACAGA 
GACTCCCCTGATGAAGGAGGACTCCATTCTGGCTGTGAGGAAATACTTCCAAAGAATCACTCTCTAT 
CTGAAAGAGAAGAAATACAGCCCTTGTGCCTGGGAGGTTGTCAGAGCAGAAATCATGAGATCT^ 
CTTTGTCAACAAACTTGCAAGAAAGTTTAAGAAGTAAGGAATAATAACTCGAGGATCCAATTAT 
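
The relationship between the two oligonucleotides and the ends of the PCR fragment can be checked computationally. The Python sketch below locates restriction sites in the primers and confirms that the reverse complement of oligonucleotide 2 matches the 3' end of the fragment as printed. The enzyme names are not given in this passage; they are the enzymes conventionally associated with these recognition sequences and are included here as an assumption about which sites are meant.

```python
# Primer sanity checks for the IFNa2B amplification (illustrative sketch).
# Sequences are copied from the text with spaces removed. Enzyme names are an
# assumption: the passage only says that the restriction sites are underlined.

OLIGO_1 = "ATAATTACTAGTTGTTGTGATCTGCCTCAAACCCACAGCC"
OLIGO_2 = "ATAATTGGATCCTCGAGTTATTATTCCTTACTTCTTAAACTTTCTTGCAAG"
FRAGMENT_LAST_LINE = (
    "CTTTGTCAACAAACTTGCAAGAAAGTTTAAGAAGTAAGGAATAATAACTCGAGGATCCAATTAT"
)

SITES = {"SpeI": "ACTAGT", "BamHI": "GGATCC", "XhoI": "CTCGAG"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq: str) -> dict:
    """Return the 0-based position of each recognition sequence found in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


print(find_sites(OLIGO_1))   # {'SpeI': 6}
print(find_sites(OLIGO_2))   # {'BamHI': 6, 'XhoI': 11}
print(FRAGMENT_LAST_LINE.endswith(reverse_complement(OLIGO_2)))  # True
```

In both primers the recognition sequences begin directly after the six-nucleotide ATAATT overhang, and the two sites in oligonucleotide 2 overlap by one base.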

The construct produced by ligation to the Npro-pET11a plasmid is called NPI-pET.

The sequence of the Npro-IFNα2B fusion protein (327 aa, of which 162 Npro and 165
IFNα2B) encoded on NPI-pET is depicted below, with the IFNα2B sequence being printed
in bold (depicted in the direction from the N-terminus to the C-terminus):

^SHHHHHHHPVGVEEPVYDTAGRPLFGNPSEVHPQSTLKLPEDRGRGDIRTTLRDLPRKGDCRSGN 
HLGPVSGIYIKPGPVr^QDYTGPVYHRAPLEFFT)EAQFCEVTKRIGRVTGSIX»KLYHIYVCVDGCILi 

LKLAKRGT PRTLKW I RNrTNC PLWVT S C C DLP QTHS LG 3 RRTUCLAQMRRI SI»F S CLXDRHDF OTP 
QEEFGNQFQFJUJTIFVIJnMIQyliraFSTTOSSAAOT^ 

ETPIJIKKDSirJWTlKYPQRITLYIJCEKKYSPCAWEVVRASIKRSrSLSTlJLQKSLRSKE 
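
The junction between the Npro portion and IFNα2B can be read off the primers by translating them in frame. The Python sketch below uses the standard genetic code and the two oligonucleotides given earlier in this example. The reading-frame offsets (6 for oligonucleotide 1, 1 for the reverse complement of oligonucleotide 2) are chosen by inspection and are an assumption of this sketch, as are all names in the code.

```python
# In-frame translation of the primer-encoded ends of the IFNa2B insert
# (illustrative sketch using the standard genetic code).

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

OLIGO_1 = "ATAATTACTAGTTGTTGTGATCTGCCTCAAACCCACAGCC"
OLIGO_2 = "ATAATTGGATCCTCGAGTTATTATTCCTTACTTCTTAAACTTTCTTGCAAG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Frame starting at the ACTAGT site (assumed offset 6 in oligonucleotide 1).
n_terminal_junction = translate(OLIGO_1[6:])
# Frame of the coding strand at the 3' end (assumed offset 1).
c_terminal_end = translate(reverse_complement(OLIGO_2)[1:])

print(n_terminal_junction)   # TSCCDLPQTHS
print(c_terminal_end[:11])   # LQESLRSKE**
```

Read this way, the ACTAGT site encodes Thr-Ser, the first TGT encodes the C-terminal Cys of the Npro portion, and the second TGT is the first codon of IFNα2B. The coding region ends with two consecutive TAA stop codons.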

The fusion protein has an Mr of 37,591.44 d in the reduced state, and after a possible
cleavage the Npro portion (reduced) would have an Mr of 18,338.34 d and the IFNα2B
portion (reduced) would have 19,271.09 d. Npro has six cysteines and IFNα2B four. It is
likely that these cysteines are for the most part in reduced form in the bacterial cytoplasm.
During the subsequent processing there is presumably at least partial formation of
disulphide bridges. It must be expected that the N-terminal methionine in the fusion protein
(or in the Npro portion) is mostly cleaved by the methionine aminopeptidase (MAP) intrinsic to
the host, which would reduce the Mr by about 131 d in each case to 37,460.23 d (fusion
protein) and 18,207.13 d (Npro).

The Escherichia coli host strain for expressing the Npro-IFNα2B fusion protein is Escherichia
coli BL21(DE3) (see Example 1).
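
The molecular masses quoted above for the Npro-IFNα2B fusion protein can be cross-checked with simple mass bookkeeping. The Python sketch below assumes average masses of about 18.015 d for water and about 131.2 d for a methionine residue. These two constants are standard approximate values and are not taken from the text, which says only "about 131 d".

```python
# Mass bookkeeping for the Npro-IFNa2B fusion protein (illustrative sketch).
# Assumed constants (approximate average masses, not stated in the text):
WATER = 18.015        # H2O, gained when one peptide bond is hydrolysed
MET_RESIDUE = 131.2   # methionine residue, i.e. free Met minus water

# Values quoted in the text, in d.
FUSION = 37591.44
NPRO = 18338.34
IFNA2B = 19271.09


def fusion_mass_from_parts(part_a: float, part_b: float) -> float:
    """Mass of two polypeptides joined by one peptide bond (one water lost)."""
    return part_a + part_b - WATER


def mass_without_n_terminal_met(mass: float) -> float:
    """Mass after removal of the N-terminal methionine residue."""
    return mass - MET_RESIDUE


print(round(fusion_mass_from_parts(NPRO, IFNA2B), 2))   # close to 37,591.44
print(round(mass_without_n_terminal_met(FUSION), 2))    # close to 37,460.23
print(round(mass_without_n_terminal_met(NPRO), 2))      # close to 18,207.13
```

Within rounding, the sum of the two cleavage products minus one water reproduces the mass of the uncleaved fusion protein, and subtracting one methionine residue reproduces the reduced values quoted for MAP processing.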

The expression strain BL21(DE3)[NPI-pET] is produced by transforming the expression
plasmid NPI-pET described above into BL21(DE3) as described in Example 1.

The strain BL21(DE3)[NPI-pET] is subcultured from a single colony on an agar plate, and
this is used to inoculate a preculture in Luria broth + 100 mg/l ampicillin (200 ml in a 1 l
baffle flask). The preculture is shaken at 250 rpm and 30°C for 14 h and reaches an OD600
of about 1.0 during this time. 10 ml portions of preculture are then used to inoculate the main
cultures (330 ml of Luria broth in each 1 l baffle flask) (3% inoculum). The main cultures are
run at 30°C (250 rpm) until the OD600 has increased to 0.8, and then production of the
fusion protein is induced with 0.5 or 1.0 mM IPTG (final concentration). The cultures are
cultivated further at 30°C and 250 rpm for 3 h, during which an OD600 of about 1.0 to 2.0 is
reached.

The cultures are transferred into sterile 500 ml centrifuge bottles and centrifuged at 
10,000 g for 30 min. The centrifugation supernatant is completely discarded, and the pellets 
are frozen at -80°C until processed further. 

The appearance of new protein bands in the complete lysate can easily be detected by
Coomassie staining after SDS-PAGE. Bands with molecular masses of about 38 kd and about
19 kd appear in the lysate of BL21(DE3)[NPI-pET]. The IFNα2B band cannot be separated from
the Npro band by SDS-PAGE.

Analyses of these samples using specific anti-IFNα2B antibodies confirm the presence of a
cleaved IFNα2B band.

To optimize the Npro-IFNα2B cleavage, inductions are carried out at various temperatures
and IPTG concentrations in this case too, and again analysed both in the stained gel and by
a Western blot. It is also found in this case that optimal cleavage takes place at reduced
temperatures (22 to 30°C).

What is claimed is:

1. A nucleic acid molecule coding for a fusion protein comprising a first polypeptide which
has the autoproteolytic function of an autoprotease Npro of a pestivirus, and a second
polypeptide which is connected to the first polypeptide at the C-terminus of the first
polypeptide in a manner such that the second polypeptide is capable of being cleaved from
the fusion protein by the autoproteolytic activity of the first polypeptide, and where the
second polypeptide is a heterologous polypeptide.

2. A nucleic acid molecule according to claim 1, wherein the pestivirus is selected from the
group of CSFV, BDV and BVDV.

3. A nucleic acid molecule according to claim 2, wherein the pestivirus is CSFV.

4. A nucleic acid molecule according to claim 3, wherein the first polypeptide comprises the
following amino acid sequence:

( 1 ) -MELNHFELLYXTSKQKPVGVEEPVYOTAGBPL 
PRKGDCRSGNHI/SPVSGIYIKPGFvTfYQDYTC 
IYVCVDGCILLKIJUaiGTPRTIJCWIimFTNCPLWvTSC-(168) , 

or the amino acid sequence of a derivative thereof with autoproteolytic activity. 

5. A nucleic acid molecule according to claim 3, wherein the first polypeptide comprises the
amino acid sequence Glu22 to Cys168 of the autoprotease Npro of CSFV or a derivative
thereof with autoproteolytic activity, wherein the first polypeptide additionally has a Met as
N-terminus, and wherein the heterologous polypeptide is connected directly to the amino
acid Cys168 of the autoprotease Npro of CSFV.

6. A nucleic acid molecule according to claim 3, wherein the first polypeptide comprises the
amino acid sequence Pro17 to Cys168 of the autoprotease Npro of CSFV or a derivative
thereof with autoproteolytic activity, wherein the first polypeptide additionally has a Met as
N-terminus, and wherein the heterologous polypeptide is connected directly to the amino
acid Cys168 of the autoprotease Npro of CSFV.